STATS 32 Session 8: Reproducible research

Kenneth Tay

Oct 17, 2019

Recap of session 7

File paths and working directories

Factors

Functions for factors

All these functions are part of the forcats package, which is automatically loaded when you load the tidyverse package.
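As a quick refresher, here is a minimal sketch of a few forcats functions in action. The toy vector `size` is hypothetical and not taken from the course materials, and the functions shown (`fct_infreq()`, `fct_rev()`, `fct_recode()`) are just common examples; they may not be exactly the ones listed in session 7.

```r
library(forcats)  # also loaded by library(tidyverse)

# Hypothetical toy data, for illustration only
size <- factor(c("small", "small", "small", "large", "medium", "medium"))

levels(size)                  # levels are alphabetical by default

by_freq  <- fct_infreq(size)  # order levels by how often they occur
reversed <- fct_rev(by_freq)  # reverse the current level order

# Rename levels (new name = "old name"); the level order is unchanged
relabelled <- fct_recode(size, S = "small", M = "medium", L = "large")

levels(by_freq)
levels(reversed)
levels(relabelled)
```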

Agenda for today

Reproducible research: what & why

Reproducible research: publishing data analyses together with their data and code so that others may “reproduce” the findings.

Why reproducible research?

R scripts

R markdown

RStudio: R markdown is a document format which allows you to “weave together narrative text and code to produce elegantly formatted output.”

Made possible by the knitr package (Yihui Xie)

(Source: Vimeo)

R markdown: output (1)

R markdown: output (2)

R markdown: output (3)

R markdown: input

R markdown: more details

Surprise: (Almost) all the class material (including slides) was created with R markdown!

Quick intro to Markdown

Markdown is a lightweight plain-text syntax that is easy to convert into a web file (i.e. HTML) with basic styling.

Has support for:

Markdown reference here.

To see what your Markdown (.md) document looks like in real time, use an online Markdown editor (e.g. dillinger.io).
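If you would rather try this from R, the sketch below converts a few lines of Markdown to HTML. It assumes the commonmark package is installed (it is not part of the class setup as far as these slides show), and the Markdown text itself is a made-up example.

```r
# Assumption: install.packages("commonmark") has been run beforehand
library(commonmark)

# A small, hypothetical Markdown document, one element per line
md <- c(
  "# A heading",
  "",
  "Some *emphasised* and **bold** text.",
  "",
  "- first item",
  "- second item"
)

# Convert the Markdown text to an HTML string and print it
html <- markdown_html(paste(md, collapse = "\n"))
cat(html)
```

The heading becomes an `<h1>` element, the starred text becomes `<em>` and `<strong>`, and the dashes become list items.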

Today’s dataset: Airbnb listings


Optional material

Rmd workflow (basic)

  1. Edit .Rmd file in RStudio.
  2. Knit the document (either by hitting the “Knit” button or using a keyboard shortcut).
    • When you press “Knit”, the file is automatically saved.
    • Next, RStudio starts a new R session, “knits” the document there, then closes that session. No code is run in your original console!
    • RStudio creates a .html file in the same folder as the .Rmd file.
  3. Preview output in the preview pane, or by opening the .html file.
    • If you want to make changes, go back to Step 1.
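The workflow above can also be sketched from the R console. This is only an illustration: the file name `example.Rmd` and its contents are hypothetical, and the rendering step assumes the rmarkdown package and pandoc are available (both come with a standard RStudio setup).

```r
# Step 1: create a tiny, hypothetical .Rmd file in a temporary folder
rmd_path <- file.path(tempdir(), "example.Rmd")
writeLines(c(
  "---",
  "title: \"Example\"",
  "output: html_document",
  "---",
  "",
  "Some narrative text.",
  "",
  "```{r}",
  "summary(cars)",
  "```"
), rmd_path)

# Step 2: knit it; render() returns the path of the file it created.
# Only attempted when rmarkdown and pandoc are both available.
html_path <- NULL
if (requireNamespace("rmarkdown", quietly = TRUE) &&
    rmarkdown::pandoc_available()) {
  html_path <- rmarkdown::render(rmd_path, quiet = TRUE)
}

# Step 3: the .html file sits in the same folder as the .Rmd file
html_path
```

One difference from the “Knit” button: calling `rmarkdown::render()` yourself evaluates the chunks in your current R session by default, so objects already in your workspace are visible to the document.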

Common Rmd chunk options
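The list on this slide is not reproduced here, so the following is a sketch of a few widely used knitr chunk options rather than the slide's exact content. It sets them as defaults for the whole document with `knitr::opts_chunk$set()`, which is typically placed in a setup chunk at the top of the .Rmd file.

```r
# Assumes the knitr package is installed (it is used by R markdown)
library(knitr)

# Defaults applied to every chunk that follows
opts_chunk$set(
  echo = FALSE,     # hide the source code, keep the results
  message = FALSE,  # drop messages (e.g. package start-up notes)
  warning = FALSE,  # drop warnings
  fig.width = 6,    # figure width in inches
  fig.height = 4    # figure height in inches
)

# Check what the current default is
opts_chunk$get("echo")
```

The same options can be set for a single chunk in its header, e.g. `{r, eval = FALSE}` to show code without running it, or `{r, include = FALSE}` to run code while hiding both the code and its output.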